Arbitrary Function of a 2×2 Hermitian Matrix

With sufficient time spent on quantum mechanics, one invariably comes across the formula for the exponential of the Pauli matrices

$σ_{1} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad σ_{2} = \begin{pmatrix} 0 & - i \\ i & 0 \end{pmatrix}, \qquad σ_{3} = \begin{pmatrix} 1 & 0 \\ 0 & - 1 \end{pmatrix}$

If these are supplemented by an identity matrix, they can be used to represent a general 2×2 Hermitian matrix as

$M = [\begin{matrix} a_{0} + a_{3} & a_{1} - i a_{2} \\ a_{1} + i a_{2} & a_{0} - a_{3} \end{matrix}] = a_{0} + σ \cdot a$

where the quantities $a_{k}$ are all real. Hermitian matrices are unchanged by simultaneous transposition and complex conjugation of their elements. They are important in quantum mechanics because their eigenvalues are always real.

The exponentiation formula for this matrix can be written

$e^{M} = e^{a_{0}} [\cosh \sqrt{a \cdot a} + (σ \cdot a) \frac{\sinh \sqrt{a \cdot a}}{\sqrt{a \cdot a}}]$

The linear combination inside the brackets is easy to verify using properties of the Pauli matrices,

$\begin{array}{l} σ_{1} σ_{2} = - σ_{2} σ_{1} \\ σ_{2} σ_{3} = - σ_{3} σ_{2} \\ σ_{3} σ_{1} = - σ_{1} σ_{3} \end{array} \qquad σ_{1}^{2} = σ_{2}^{2} = σ_{3}^{2} = 1$

so that the cross terms cancel in pairs and the square of a traceless linear combination of Pauli matrices has the simple form

$(σ \cdot a)^{2} = \sum_{k, m = 1}^{3} a_{k} a_{m} σ_{k} σ_{m} = \sum_{k = 1}^{3} a_{k}^{2} = a \cdot a$
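The squaring identity can be checked numerically. The following is a minimal sketch in plain Python, with no external libraries; the helper names `matmul` and `sigma_dot` and the sample vector are illustrative choices, not part of the derivation.

```python
# Pauli matrices stored as nested tuples of (complex) numbers.
SIGMA = (
    ((0, 1), (1, 0)),
    ((0, -1j), (1j, 0)),
    ((1, 0), (0, -1)),
)

def matmul(x, y):
    """Product of two 2x2 matrices stored as nested sequences."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def sigma_dot(a):
    """Return the 2x2 matrix sigma . a for a real 3-vector a."""
    return [[sum(a[k] * SIGMA[k][i][j] for k in range(3)) for j in range(2)]
            for i in range(2)]

a = (1.0, 2.0, 2.0)                      # sample vector with a . a = 9
a_dot_a = sum(x * x for x in a)
square = matmul(sigma_dot(a), sigma_dot(a))

# The off-diagonal entries should vanish and the diagonal entries equal a . a.
print(a_dot_a, square)
```

Any real vector should behave the same way; the one chosen here simply makes $a \cdot a$ a perfect square so the result is easy to read.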

For an exponential function, the traceful part of the Hermitian matrix is simply a separate factor. Then the even terms of the exponential form the hyperbolic cosine and the odd terms the hyperbolic sine divided by the norm of the real vector.
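As a sanity check on the exponentiation formula, the closed form can be compared with a truncated power series for $e^{M}$. This is only a sketch under stated assumptions: the function names, the sample values of $a_{0}$ and $a$, and the choice of 40 series terms are all illustrative, and the closed form as coded requires $a \cdot a \neq 0$.

```python
import math

def matmul(x, y):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def hermitian(a0, a):
    """Build M = a0 + sigma . a from real a0 and a real 3-vector a."""
    a1, a2, a3 = a
    return [[a0 + a3, a1 - 1j * a2], [a1 + 1j * a2, a0 - a3]]

def exp_series(m, terms=40):
    """Truncated power series: sum of M^k / k! for k < terms."""
    result = [[1, 0], [0, 1]]
    term = [[1, 0], [0, 1]]
    for k in range(1, terms):
        # After this step, term holds M^k / k!.
        term = [[e / k for e in row] for row in matmul(term, m)]
        result = [[result[i][j] + term[i][j] for j in range(2)]
                  for i in range(2)]
    return result

def exp_closed(a0, a):
    """Closed form e^{a0} [cosh r + (sigma . a) sinh(r) / r], r = sqrt(a . a)."""
    a1, a2, a3 = a
    r = math.sqrt(a1 * a1 + a2 * a2 + a3 * a3)   # assumed nonzero
    c = math.exp(a0) * math.cosh(r)
    s = math.exp(a0) * math.sinh(r) / r
    return [[c + s * a3, s * (a1 - 1j * a2)],
            [s * (a1 + 1j * a2), c - s * a3]]

a0, a = 0.3, (0.5, -0.2, 0.7)            # arbitrary sample values
series = exp_series(hermitian(a0, a))
closed = exp_closed(a0, a)
error = max(abs(series[i][j] - closed[i][j])
            for i in range(2) for j in range(2))
print(error)                             # should be at rounding level
```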

For functions other than the exponential, determining the form of $f (M)$ with matrix multiplication is not as immediately straightforward. A more convenient approach is to diagonalize the matrix, apply the function to its eigenvalues and then invert the diagonalization. The final result is pleasingly simple and applies to all analytic functions.

The process of diagonalizing a matrix is described by a similarity transform

$M_{diagonal} = S^{- 1} M S$

where the columns of the invertible matrix S are the normalized eigenvectors of the diagonalizable matrix. This statement can be rearranged to

$M = S M_{diagonal} S^{- 1}$

An arbitrary power of the matrix is built up from products of the entire right-hand side, with adjacent products of the matrix S and its inverse cancelling to unity. The simple final result is

$M^{k} = S M_{diagonal}^{k} S^{- 1}$

where the powers of a diagonal matrix are evaluated as powers of the eigenvalues along the diagonal. For any function expressible as a power series one then has

$f (M) = \sum_{k = 0}^{\infty} c_{k} M^{k} = \sum_{k = 0}^{\infty} c_{k} S M_{diagonal}^{k} S^{- 1} = S f (M_{diagonal}) S^{- 1}$

Apply this to the 2×2 Hermitian matrix. The eigenvalues are $a_{0} \pm \sqrt{a \cdot a}$ and the corresponding normalized eigenvectors are

$\frac{1}{n_{\pm}} [a_{1} - i a_{2}, - a_{3} \pm \sqrt{a \cdot a}], \qquad n_{\pm} = \sqrt{a_{1}^{2} + a_{2}^{2} + (a_{3} \mp \sqrt{a \cdot a})^{2}}$

The diagonalizing matrix and its inverse are

$S = [\begin{array}{cc} \frac{a_{1} - i a_{2}}{n_{+}} & \frac{a_{1} - i a_{2}}{n_{-}} \\ \frac{- a_{3} + \sqrt{a \cdot a}}{n_{+}} & \frac{- a_{3} - \sqrt{a \cdot a}}{n_{-}} \end{array}], \qquad S^{- 1} = [\begin{array}{cc} \frac{a_{1} + i a_{2}}{n_{+}} & \frac{- a_{3} + \sqrt{a \cdot a}}{n_{+}} \\ \frac{a_{1} + i a_{2}}{n_{-}} & \frac{- a_{3} - \sqrt{a \cdot a}}{n_{-}} \end{array}]$
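The diagonalizing matrix and its inverse can be verified numerically before they are used. The sketch below builds $S$ and $S^{- 1}$ entry by entry from the expressions given and checks that $S^{- 1} S$ is the identity and that $S^{- 1} M S$ is diagonal with the two eigenvalues. The sample values are arbitrary, and the check assumes $a_{1}$ and $a_{2}$ are not both zero, since otherwise one of the eigenvectors as written has zero length.

```python
import math

def matmul(x, y):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

a0 = 0.3
a1, a2, a3 = 0.5, -0.2, 0.7              # sample values, a1 and a2 not both zero
r = math.sqrt(a1 * a1 + a2 * a2 + a3 * a3)   # sqrt(a . a)

M = [[a0 + a3, a1 - 1j * a2], [a1 + 1j * a2, a0 - a3]]

# Normalizing factors n_+ and n_- of the two eigenvectors.
n_plus = math.sqrt(a1 * a1 + a2 * a2 + (a3 - r) ** 2)
n_minus = math.sqrt(a1 * a1 + a2 * a2 + (a3 + r) ** 2)

# Columns of S are the normalized eigenvectors.
S = [[(a1 - 1j * a2) / n_plus, (a1 - 1j * a2) / n_minus],
     [(-a3 + r) / n_plus, (-a3 - r) / n_minus]]
# The inverse is the conjugate transpose of S.
S_inv = [[(a1 + 1j * a2) / n_plus, (-a3 + r) / n_plus],
         [(a1 + 1j * a2) / n_minus, (-a3 - r) / n_minus]]

identity = matmul(S_inv, S)              # should be the identity matrix
diagonal = matmul(matmul(S_inv, M), S)   # should be diag(a0 + r, a0 - r)
print(identity)
print(diagonal)
```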

The denominators appearing in $S f (M_{diagonal}) S^{- 1}$ will all be squares of the normalizing factors, which can be written

$\frac{1}{n_{\pm}^{2}} = \frac{1}{2 (a \cdot a) \mp 2 a_{3} \sqrt{a \cdot a}} = \frac{\sqrt{a \cdot a} \pm a_{3}}{2 \sqrt{a \cdot a} (a_{1}^{2} + a_{2}^{2})}$

and it now becomes straightforward to evaluate matrix elements in

$f (M) = S [\begin{array}{c} f (a_{0} + \sqrt{a \cdot a}) & 0 \\ 0 & f (a_{0} - \sqrt{a \cdot a}) \end{array}] S^{- 1}$

The individual results are

$\begin{array}{l} f (M)_{11} = (a_{1}^{2} + a_{2}^{2}) [\frac{f (a_{0} + \sqrt{a \cdot a})}{n_{+}^{2}} + \frac{f (a_{0} - \sqrt{a \cdot a})}{n_{-}^{2}}] \\ = \frac{1}{2} [f (a_{0} + \sqrt{a \cdot a}) + f (a_{0} - \sqrt{a \cdot a})] \\ + \frac{a_{3}}{2 \sqrt{a \cdot a}} [f (a_{0} + \sqrt{a \cdot a}) - f (a_{0} - \sqrt{a \cdot a})] \\ f (M)_{12} = (a_{1} - i a_{2}) [\frac{f (a_{0} + \sqrt{a \cdot a}) (- a_{3} + \sqrt{a \cdot a})}{n_{+}^{2}} \\ + \frac{f (a_{0} - \sqrt{a \cdot a}) (- a_{3} - \sqrt{a \cdot a})}{n_{-}^{2}}] \\ = \frac{(a_{1} - i a_{2})}{2 \sqrt{a \cdot a}} [f (a_{0} + \sqrt{a \cdot a}) - f (a_{0} - \sqrt{a \cdot a})] \\ f (M)_{21} = (a_{1} + i a_{2}) [\frac{f (a_{0} + \sqrt{a \cdot a}) (- a_{3} + \sqrt{a \cdot a})}{n_{+}^{2}} \\ + \frac{f (a_{0} - \sqrt{a \cdot a}) (- a_{3} - \sqrt{a \cdot a})}{n_{-}^{2}}] \\ = \frac{(a_{1} + i a_{2})}{2 \sqrt{a \cdot a}} [f (a_{0} + \sqrt{a \cdot a}) - f (a_{0} - \sqrt{a \cdot a})] \\ f (M)_{22} = \frac{f (a_{0} + \sqrt{a \cdot a}) (\sqrt{a \cdot a} - a_{3})^{2}}{n_{+}^{2}} + \frac{f (a_{0} - \sqrt{a \cdot a}) (\sqrt{a \cdot a} + a_{3})^{2}}{n_{-}^{2}} \\ = \frac{1}{2} [f (a_{0} + \sqrt{a \cdot a}) + f (a_{0} - \sqrt{a \cdot a})] \\ - \frac{a_{3}}{2 \sqrt{a \cdot a}} [f (a_{0} + \sqrt{a \cdot a}) - f (a_{0} - \sqrt{a \cdot a})] \end{array}$

which can be collected in Pauli matrix notation as

$\begin{array}{l} f (M) = \frac{1}{2} [f (a_{0} + \sqrt{a \cdot a}) + f (a_{0} - \sqrt{a \cdot a})] \\ + \frac{(σ \cdot a)}{2 \sqrt{a \cdot a}} [f (a_{0} + \sqrt{a \cdot a}) - f (a_{0} - \sqrt{a \cdot a})] \end{array}$
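The collected formula translates almost line by line into code. The following sketch evaluates $f (M)$ from the two eigenvalues and tests it with the square root, a function other than the exponential, by confirming that the result squares back to $M$. The function name `matrix_function`, the guard for $a \cdot a = 0$, and the sample matrix (chosen so both eigenvalues are positive) are assumptions made for illustration.

```python
import math

def matmul(x, y):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def matrix_function(f, a0, a):
    """Evaluate f(M) for M = a0 + sigma . a from its two eigenvalues."""
    a1, a2, a3 = a
    r = math.sqrt(a1 * a1 + a2 * a2 + a3 * a3)
    if r == 0:
        # Not covered by the formula as written: M is a0 times the identity.
        raise ValueError("a . a = 0, so f(M) is simply f(a0) times the identity")
    f_plus, f_minus = f(a0 + r), f(a0 - r)
    s = 0.5 * (f_plus + f_minus)         # coefficient of the identity
    d = (f_plus - f_minus) / (2 * r)     # coefficient of sigma . a
    return [[s + d * a3, d * (a1 - 1j * a2)],
            [d * (a1 + 1j * a2), s - d * a3]]

# Sample matrix with a0 = 5 and a = (1, 2, 2): eigenvalues 8 and 2.
M = [[7.0, 1 - 2j], [1 + 2j, 3.0]]
root = matrix_function(math.sqrt, 5.0, (1.0, 2.0, 2.0))
square = matmul(root, root)
error = max(abs(square[i][j] - M[i][j]) for i in range(2) for j in range(2))
print(error)                             # should be at rounding level
```

Passing `math.exp` instead of `math.sqrt` should reproduce the hyperbolic form of the exponential, and any other scalar function defined at both eigenvalues can be substituted in the same way.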

This is the pleasingly simple result promised. The formula for the exponential of the matrix follows immediately from recognizing that the hyperbolic cosine is half the sum of an exponential and its inverse, while the hyperbolic sine is half their difference.

As another check of the simple result, directly evaluate an arbitrary power of the Hermitian matrix using the binomial theorem:

$\begin{array}{l} M^{k} = (a_{0} + σ \cdot a)^{k} = \sum_{m = 0}^{k} \binom{k}{m} a_{0}^{k - m} (σ \cdot a)^{m} \\ = \sum_{m = 0}^{\lfloor \frac{k}{2} \rfloor} \binom{k}{2 m} a_{0}^{k - 2 m} (a \cdot a)^{m} + \sum_{m = 0}^{\lfloor \frac{k - 1}{2} \rfloor} \binom{k}{2 m + 1} a_{0}^{k - 2 m - 1} (a \cdot a)^{m} (σ \cdot a) \end{array}$

The corresponding statement from the pleasingly simple result is

$\begin{array}{l} M^{k} = \frac{1}{2} [(a_{0} + \sqrt{a \cdot a})^{k} + (a_{0} - \sqrt{a \cdot a})^{k}] \\ + \frac{(σ \cdot a)}{2 \sqrt{a \cdot a}} [(a_{0} + \sqrt{a \cdot a})^{k} - (a_{0} - \sqrt{a \cdot a})^{k}] \end{array}$

When the powers on the right-hand side are expanded with the binomial theorem, cancellation leaves only even powers of $\sqrt{a \cdot a}$ in the first bracket and only odd powers of $\sqrt{a \cdot a}$ in the second, where the prefactor $1 / \sqrt{a \cdot a}$ turns them into integer powers of $a \cdot a$. The final summations will be exactly the same as those in the direct evaluation.
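The agreement between the two forms of $M^{k}$ can be confirmed exactly with integer arithmetic. The sketch below compares the coefficients of the identity and of $σ \cdot a$ from the binomial sums with those from the eigenvalue form. To keep everything in integers it assumes integer $a_{0}$ and an integer value of $\sqrt{a \cdot a}$; the function names and the sample values $a_{0} = 1$, $a = (1, 2, 2)$ are illustrative only.

```python
import math

def power_coefficients_binomial(k, a0, a_dot_a):
    """Coefficients of 1 and of sigma . a in M^k from the binomial sums."""
    even = sum(math.comb(k, 2 * m) * a0 ** (k - 2 * m) * a_dot_a ** m
               for m in range(k // 2 + 1))
    odd = sum(math.comb(k, 2 * m + 1) * a0 ** (k - 2 * m - 1) * a_dot_a ** m
              for m in range((k - 1) // 2 + 1))
    return even, odd

def power_coefficients_closed(k, a0, r):
    """Same coefficients from the eigenvalues a0 + r and a0 - r (integer r)."""
    p, q = a0 + r, a0 - r
    # Both divisions are exact: p and q have equal parity, and p - q = 2 r.
    return (p ** k + q ** k) // 2, (p ** k - q ** k) // (2 * r)

a0, r = 1, 3                             # a = (1, 2, 2) gives a . a = 9
results = [(k,
            power_coefficients_binomial(k, a0, r * r),
            power_coefficients_closed(k, a0, r))
           for k in range(8)]
for k, binomial, closed in results:
    print(k, binomial, closed)           # the two pairs should coincide
```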

In retrospect the equivalence of these two forms for an arbitrary power of the Hermitian matrix is completely understandable, yet perhaps not completely obvious. Intuiting this second form for an arbitrary power would of course lead to the pleasingly simple result without any matrix diagonalization. Then again, if the intuiting were that obvious this simple result would appear in all quantum mechanics texts, wouldn’t it?